Measurement
Measurement is the principled assignment of numbers to qualities. And no matter what you measure or how you measure it, you’ll have some error.
Some tools tend to produce very little error (e.g., measuring the length of this table in inches). Other tools tend to produce more error. In the social sciences, our job is made harder by the fact that we often measure qualities that do not have an objective physical reality, like happiness or well-being. Measurement error is exacerbated when there is little available data.
Unfortunately, our statistical models will assume your measure has no error… unless you can tell the model how much error there is.
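To see why that matters, here is a minimal simulation (all numbers made up for illustration) of what happens when a regression ignores measurement error in a predictor: the estimated slope is attenuated toward zero.

```r
# Attenuation from ignoring measurement error in a predictor.
set.seed(15)

n      <- 10000
x_true <- rnorm(n, mean = 0, sd = 1)
y      <- 2 * x_true + rnorm(n, mean = 0, sd = 0.5)  # true slope = 2
x_obs  <- x_true + rnorm(n, mean = 0, sd = 1)        # noisy measure of x

coef(lm(y ~ x_true))["x_true"]  # close to the true slope, 2
coef(lm(y ~ x_obs))["x_obs"]    # attenuated toward 0
```

With equal true-score and error variances, the expected slope on the noisy predictor is halved: \(2 \times 1/(1 + 1) = 1\).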
In measurement theory, we may assume that
\[ X_i = T_i + \epsilon_i \]
Meaning that for any observation \(i\), the observed score \(X\) on some measure is the sum of the true score \(T\) and error \(\epsilon\). Classical test theory assumes that \(\epsilon_i\) is randomly distributed, while other frameworks (e.g., item response theory) make different assumptions. Regardless, we can move forward with the assumption that observed scores are caused by some true score and some error.
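The decomposition is easy to simulate. A quick sketch, assuming normally distributed true scores and errors (all numbers made up for illustration):

```r
# Classical test theory: X_i = T_i + epsilon_i
set.seed(15)

n          <- 10000
true_score <- rnorm(n, mean = 100, sd = 15)  # T_i
error      <- rnorm(n, mean = 0,   sd = 5)   # epsilon_i, mean-zero noise
observed   <- true_score + error             # X_i

# With independent components, Var(X) = Var(T) + Var(epsilon), so
# observed scores are noisier than true scores. The ratio
# Var(T) / Var(X) is the classical definition of reliability.
var(true_score) / var(observed)  # roughly 15^2 / (15^2 + 5^2) = 0.9
```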
                        mean         sd     5.5%      94.5%  histogram
Location                 NaN         NA       NA         NA
Loc                      NaN         NA       NA         NA
Population              6.12       6.88     0.66      18.98  ▇▃▁▁▁▁▁▁
MedianAgeMarriage      26.05       1.24    24.27      28.26  ▁▁▂▂▃▇▅▃▁▁▂▁▁▁
Marriage               20.11       3.80    15.21      26.49  ▁▃▇▇▇▅▂▁▁▁
Marriage.SE             1.40       0.80     0.55       2.90  ▁▇▅▃▁▂▁▁
Divorce                 9.69       1.82     6.67      12.73  ▂▃▅▅▇▂▃▁
Divorce.SE              0.96       0.53     0.34       1.89  ▂▇▇▃▃▂▁▂▂▁▁▁
WaffleHouses           32.34      65.79     0.00     135.75  ▇▁▁▁▁▁▁▁
South                   0.28       0.45     0.00       1.00  ▇▁▁▁▁▁▁▁▁▂
Slaves1860          79378.34  149730.90     0.00  435553.10  ▇▁▁▁▁▁▁▁▁▁
Population1860     628729.30  781312.70     0.00 1903357.00  ▇▃▂▁▁▁▁▁
PropSlaves1860          0.09       0.17     0.00       0.46  ▇▁▁▁▁▁▁▁▁▁▁▁
dag_coords <-
tibble(name = c("A", "M", "D", "Dobs", "eD"),
x = c(1, 2, 2, 3, 4),
y = c(2, 3, 1, 1, 1))
dagify(M ~ A,
D ~ A + M,
Dobs ~ D + eD,
coords = dag_coords) %>%
tidy_dagitty() %>%
mutate(color = ifelse(name %in% c("D", "eD"), "a", "b")) %>%
ggplot(aes(x = x, y = y, xend = xend, yend = yend)) +
geom_dag_point(aes(color = color),
size = 10, show.legend = F) +
geom_dag_text(parse = T, label = c("A", "D", expression(D[obs]),
"M",expression(italic(e)[D]))) +
geom_dag_edges() +
theme_void()

p1 <-
  d %>%
ggplot(aes(x = MedianAgeMarriage,
y = Divorce,
ymin = Divorce - Divorce.SE,
ymax = Divorce + Divorce.SE)) +
geom_pointrange(shape = 20, alpha = 2/3, color="#1c5253") +
labs(x = "Median age marriage" ,
y = "Divorce rate")
p2 <-
d %>%
ggplot(aes(x = log(Population),
y = Divorce,
ymin = Divorce - Divorce.SE,
ymax = Divorce + Divorce.SE)) +
geom_pointrange(shape = 20, alpha = 2/3, color="#e07a5f") +
scale_y_continuous(NULL, breaks = NULL) +
xlab("log population")
p1 | p2

How do you incorporate the standard error into the model? First, we’ll think through this logically, then we’ll use the code.
Measurement error refers to the amount of variability we would expect to see in a statistic across repeated studies. In other words, the error describes the spread of a distribution. What’s unknown in this case is the mean of that distribution: the true value.
\[ D_{\text{obs},i} \sim \text{Normal}(D_i, D_{\text{SE},i}) \]
Well hey, isn’t that the kind of parameter that Bayesian analysis has been trying to estimate all along?
\[\begin{align} D_{\text{obs},i} &\sim \text{Normal}(D_i, D_{\text{SE},i}) \\ D_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha + \beta_AA_i + \beta_MM_i \\ \alpha &\sim \text{Normal}(0, 0.2) \\ \beta_A &\sim \text{Normal}(0, 0.5) \\ \beta_M &\sim \text{Normal}(0, 0.5) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]
For the sake of easier priors, we’ll standardize all our variables first. Note that the standard errors get divided by the standard deviation of the raw variable (but not re-centered), which keeps them in the same units as the standardized variable they refer to.
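The rescaling can be sketched with toy numbers (the values below are made up; in the actual data, `D_obs`, `D_sd`, `A`, and `M` would be built the same way from `Divorce`, `Divorce.SE`, `MedianAgeMarriage`, and `Marriage`):

```r
# Standardize a variable and put its known standard error on the
# same standardized scale.
standardize <- function(x) (x - mean(x)) / sd(x)

divorce    <- c(9.7, 12.5, 10.8, 13.5, 8.0)  # made-up raw rates
divorce_se <- c(1.1, 0.9, 1.2, 0.8, 1.0)     # made-up standard errors

d_obs <- standardize(divorce)      # mean 0, SD 1
d_sd  <- divorce_se / sd(divorce)  # SEs rescaled, but not re-centered
```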
m1 <-
brm(data = d,
family = gaussian,
D_obs | mi(D_sd) ~ 1 + A + M,
prior = c(prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 0.5), class = b),
prior(exponential(1), class = sigma)),
iter = 2000, warmup = 1000, cores = 4, chains = 4,
seed = 15,
# note this line
save_pars = save_pars(latent = TRUE),
      file = here("files/models/m92.1"))

 Family: gaussian
Links: mu = identity; sigma = identity
Formula: D_obs | mi(D_sd) ~ 1 + A + M
Data: d (Number of observations: 50)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -0.05 0.10 -0.24 0.14 1.00 4597 3090
A -0.62 0.16 -0.92 -0.30 1.00 3829 3423
M 0.05 0.17 -0.27 0.38 1.00 3671 3297
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.58 0.11 0.39 0.81 1.00 1682 1720
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Estimate Est.Error Q2.5 Q97.5
b_Intercept -0.05287903 0.09560518 -0.23886745 0.13726794
b_A -0.61700465 0.15906799 -0.92245919 -0.29941922
b_M 0.05250207 0.16596210 -0.27093761 0.38138566
sigma 0.58407876 0.10676964 0.38881281 0.81491478
Intercept -0.05287903 0.09560518 -0.23886745 0.13726794
Yl[1] 1.16380958 0.36399705 0.45678248 1.88019251
Yl[2] 0.68798528 0.55545971 -0.40532881 1.78866913
Yl[3] 0.42646352 0.33696468 -0.22714592 1.07800780
Yl[4] 1.41990481 0.45735398 0.51982522 2.33835163
Yl[5] -0.90062131 0.12857833 -1.15488620 -0.64688189
Yl[6] 0.65994433 0.39848512 -0.10032271 1.45295620
Yl[7] -1.36526810 0.35504558 -2.05486652 -0.66823571
Yl[8] -0.33840244 0.47247230 -1.25609775 0.58214114
Yl[9] -1.90150049 0.58817751 -3.05336435 -0.72470074
Yl[10] -0.62132190 0.17186028 -0.95587821 -0.27755085
Yl[11] 0.76341246 0.28108009 0.20629632 1.31262617
Yl[12] -0.55322097 0.48671358 -1.53032059 0.41006406
Yl[13] 0.18014145 0.50648969 -0.85327426 1.15270187
Yl[14] -0.87210178 0.22594215 -1.31849712 -0.42656000
Yl[15] 0.56087538 0.30848278 -0.03307509 1.16024922
Yl[16] 0.28374664 0.40123574 -0.53788724 1.07797302
Yl[17] 0.49757257 0.42622575 -0.34424766 1.32234855
Yl[18] 1.25343126 0.34850197 0.58866869 1.93645181
Yl[19] 0.43293128 0.38544033 -0.30272767 1.20778192
Yl[20] 0.40469307 0.54221317 -0.60265472 1.49375005
Yl[21] -0.55866095 0.31505583 -1.18076309 0.06387204
Yl[22] -1.10085094 0.26189265 -1.60983803 -0.58917295
Yl[23] -0.26811533 0.26452671 -0.78206170 0.24950601
Yl[24] -1.00154932 0.30528291 -1.62109423 -0.40641259
Yl[25] 0.43228569 0.39575018 -0.33219269 1.22640132
Yl[26] -0.03375897 0.31355089 -0.66626319 0.58620335
Yl[27] -0.01176553 0.50233364 -0.99239479 1.00288051
Yl[28] -0.14619945 0.38981782 -0.92315210 0.61324160
Yl[29] -0.26257719 0.50330444 -1.20212247 0.74789655
Yl[30] -1.80061034 0.23677345 -2.25647365 -1.33209152
Yl[31] 0.17265156 0.41087733 -0.63822136 0.99652013
Yl[32] -1.66073596 0.16260915 -1.98035225 -1.34235191
Yl[33] 0.11852901 0.24138612 -0.36122545 0.60586708
Yl[34] -0.05418447 0.51141807 -1.11458891 0.91539123
Yl[35] -0.12291325 0.22737764 -0.57408527 0.32393579
Yl[36] 1.28479965 0.40761437 0.48882609 2.11017466
Yl[37] 0.22368318 0.35026426 -0.46485333 0.91459730
Yl[38] -1.02508669 0.21833912 -1.44644004 -0.59395760
Yl[39] -0.92956225 0.54071523 -1.96249272 0.16346135
Yl[40] -0.67944819 0.32342300 -1.31483203 -0.05196159
Yl[41] 0.24407462 0.54826453 -0.81917624 1.32536575
Yl[42] 0.74595423 0.33110341 0.09928354 1.38997465
Yl[43] 0.19636347 0.18408346 -0.16880604 0.55807729
Yl[44] 0.80183926 0.42335058 -0.05159185 1.60462754
Yl[45] -0.40557248 0.52661056 -1.42459914 0.68427256
Yl[46] -0.39240888 0.25684943 -0.89652352 0.09167503
Yl[47] 0.13440823 0.30195290 -0.46337386 0.72571630
Yl[48] 0.55957284 0.44795065 -0.31515399 1.44366592
Yl[49] -0.63550849 0.27979049 -1.18884521 -0.08282839
Yl[50] 0.85003581 0.58270897 -0.33293414 1.97866945
lprior -1.36690837 0.44820845 -2.44940868 -0.71213312
lp__ -78.15515223 6.57052200 -91.46921227 -65.53798980
states <- c("AL", "AR", "ME", "NH", "RI", "DC", "VT", "AK", "SD", "UT", "ID", "ND", "WY")
d_est <-
posterior_summary(m1) %>%
data.frame() %>%
rownames_to_column("term") %>%
mutate(D_est = Estimate) %>%
select(term, D_est) %>%
filter(str_detect(term, "Yl")) %>%
bind_cols(d)
d_est %>%
ggplot(aes(x = D_sd, y = D_est - D_obs)) +
geom_hline(yintercept = 0, linetype = 2) +
geom_point(alpha = 2/3) +
geom_text_repel(data = . %>% filter(Loc %in% states),
aes(label = Loc),
                  size = 3, seed = 15)

states <- c("AR", "ME", "RI", "ID", "WY", "ND", "MN")
as_draws_df(m1) %>%
expand_grid(A = seq(from = -3.5, to = 3.5, length.out = 50)) %>%
mutate(fitted = b_Intercept + b_A * A) %>%
ggplot(aes(x = A)) +
stat_lineribbon(aes(y = fitted),
.width = .95, size = 1/3,
color = "grey20",
fill = "grey80") +
geom_segment(data = d_est,
aes(xend = A, y = D_obs, yend = D_est),
linewidth = 1/5) +
geom_point(data = d_est,
aes(y = D_obs)) +
geom_point(data = d_est,
aes(y = D_est),
shape = 1, stroke = 1/3) +
geom_text_repel(data = d %>% filter(Loc %in% states),
aes(y = D_obs, label = Loc),
size = 3, seed = 15) +
labs(x = "median age marriage (std)",
y = "divorce rate (std)") +
coord_cartesian(xlim = range(d$A),
ylim = range(d$D_obs))

- Incorporating measurement error into your model results in regularization: observations associated with more error are less influential on the model, and the estimated true scores for those points can be (but are not necessarily) quite different from the observed values.
- This results in a model that is more conservative, but one you can have more faith in.
- And of course, there’s an obvious application of this kind of modeling...
The goal of a meta-analysis is estimating meaningful parameters. Studies that are broadly comparable are either…
1. identical replications of each other in that all studies are identical samples of the same population with the same outcome measures, etc, or
2. so different that the results of any one study provide no information about the results of any of the others, or
3. exchangeable, but not necessarily either identical or completely unrelated; in other words, we allow differences from study to study, but differences that are not expected a priori to have predictable effects favoring one study over another.
In other words, you can answer the question with complete pooling, no pooling, or partial pooling. And generally speaking, the third of these is probably the one you want.
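A toy numeric sketch of the three options, with made-up effect sizes and standard errors for five studies (here the between-study variance is fixed by hand; a full Bayesian model would estimate it from the data):

```r
yi  <- c(0.30, 0.10, 0.45, 0.20, 0.05)  # study effect sizes (made up)
sei <- c(0.10, 0.20, 0.15, 0.05, 0.25)  # their standard errors (made up)

# No pooling: each study's estimate is just yi.

# Complete pooling: one inverse-variance-weighted estimate.
w      <- 1 / sei^2
pooled <- sum(w * yi) / sum(w)

# Partial pooling: each study is shrunk toward the pooled mean, more
# strongly when its standard error is large relative to the assumed
# between-study variance tau2.
tau2    <- 0.01
w_own   <- tau2 / (tau2 + sei^2)  # weight on the study's own estimate
partial <- w_own * yi + (1 - w_own) * pooled
```

In brms, partial pooling of this kind corresponds roughly to a formula like `yi | se(sei) ~ 1 + (1 | study)`, with the between-study variance estimated rather than fixed.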
Our data come from the second large-scale replication project by the Many Labs team (Klein et al., 2018). Of the 28 studies replicated in the study, we will focus on the replication of the trolley experiment from Hauser et al. (2007).
From Klein and colleagues:
According to the principle of double effect, an act that harms other people is more morally permissible if the act is a foreseen side effect rather than the means to the greater good. Hauser et al. (2007) compared participants’ reactions to two scenarios to test whether their judgments followed this principle. In the foreseen-side-effect scenario, a person on an out-of-control train changed the train’s trajectory so that the train killed one person instead of five. In the greater-good scenario, a person pushed a fat man in front of a train, killing him, to save five people. Whereas 89% of participants judged the action in the foreseen-side-effect scenario as permissible (95% CI = [87%, 91%]), only 11% of participants in the greater-good scenario judged it as permissible (95% CI = [9%, 13%]). The difference between the percentages was significant, \(\chi^2(1, N = 2{,}646) = 1{,}615.96\), \(p < .001\), \(w = .78\), \(d = 2.50\), 95% CI = [2.22, 2.86]. Thus, the results provided evidence for the principle of double effect. (p. 459, emphasis in the original)
You can download the data from OSF. You want the .csv file in Trolley Dilemma 1 > Hauser.1 > By Order > Data > Hauser_1_study_by_order_all_CLEAN_CASE.csv
Rows: 6,842
Columns: 27
$ uID <dbl> 65, 68, 102, 126, 145, 263, 267, 298, 309, 318, 350, …
$ variable <chr> "Yes", "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes"…
$ factor <chr> "SideEffect", "SideEffect", "SideEffect", "SideEffect…
$ .id <chr> "ML2_Slate1_Brazil__Portuguese_execution_illegal_r.cs…
$ source <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ haus1.1 <dbl> 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1,…
$ haus1.1t_1 <dbl> 39.054, 36.792, 56.493, 21.908, 25.635, 50.633, 58.66…
$ haus2.1 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ haus2.1t_1 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ Source.Global <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ Source.Primary <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ Source.Secondary <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ Country <chr> "Brazil", "Brazil", "Brazil", "Canada", "Canada", "Ca…
$ Location <chr> "Social and Work Psychology Department, University of…
$ Language <chr> "Portuguese", "Portuguese", "Portuguese", "English", …
$ Weird <dbl> 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ Execution <chr> "illegal", "illegal", "illegal", "illegal", "illegal"…
$ SubjectPool <chr> "No", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", …
$ Setting <chr> "In a classroom", "In a classroom", "In a classroom",…
$ Tablet <chr> "Computers", "Computers", "Computers", "Computers", "…
$ Pencil <chr> "No, the whole study was on the computer (except mayb…
$ StudyOrderN <chr> "Hauser|Ross.Slate1|Rottenstrich|Graham|Kay|Inbar|And…
$ IDiffOrderN <chr> "ID: Global self-esteem SISE|ID: Mood|ID: Subjective …
$ study.order <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ analysis.type <chr> "Order", "Order", "Order", "Order", "Order", "Order",…
$ subset <chr> "all", "all", "all", "all", "all", "all", "all", "all…
$ case.include <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,…
# A tibble: 59 × 2
Location n
<chr> <int>
1 University of Toronto, Scarborough 325
2 MTurk India Workers 308
3 MTurk US Workers 304
4 University of Illinois at Urbana-Champaign, Champaign, IL 198
5 Eotvos Lorand University, in Budapest, Hungary 180
6 Department of Social Psychology, Tilburg University, P.O. Box 90153, T… 173
7 Department of Psychology, San Diego State University, San Diego, CA 92… 171
8 Department of Psychology, Pennsylvania State University Abington, Abin… 166
9 American University of Sharjah, United Arab Emirates 162
10 University of British Columbia, Vancouver, Canada 147
# ℹ 49 more rows
# A tibble: 59 × 2
Location n
<chr> <int>
1 badania.net 34
2 Open University of Hong Kong 36
3 Universidade do Porto, Portugal 36
4 Université de Poitiers, France 44
5 University of Navarra, Spain 48
6 Santiago, Chile 59
7 University of Johannesburg, Johanneburg, South Africa 70
8 Koç University, Istanbul, Turkey 74
9 Department of Psychology, Ithaca College, Ithaca, NY 14850 76
10 Tufts 79
# ℹ 49 more rows